feat: EP0 mouse/kbd polling, CC=12 investigation (Parallels xHCI) #239
Open
Systematic investigation of CC=12 (Endpoint Not Enabled) on Parallels ARM64 virtual xHCI interrupt endpoints, with working EP0 GET_REPORT polling for both keyboard and mouse.

## Findings

CC=12 on interrupt IN is a fundamental Parallels virtual xHC limitation:

- Parallels proactively generates CC=12 Transfer Events after re-ConfigureEndpoint (before any TRBs are queued), signaling that interrupt IN transfers are not supported
- This persists regardless of command sequence: BSR=0/1, bandwidth dance enabled/disabled, BEI flag, matching Linux byte-for-byte
- EP0 control transfers work reliably; interrupt IN never completes
- Linux keyboard/mouse works via Parallels Tools injection, not xHCI interrupt IN (Linux ftrace shows no interrupt TRB completions in trace)

## Working solution: EP0 GET_REPORT polling

Both keyboard and mouse input work via EP0 GET_REPORT control transfers at 4 Hz (limited by Parallels virtual USB device timing, not our polling interval):

- gr=N gk=N mr=N mk=N with ge=me=0 (zero errors) verified in testing
- Ring recycling (StopEndpoint + SetTRDequeuePointer) for EP0 rings since Parallels does not follow Link TRBs on transfer rings

## Changes

xhci.rs:

- Mouse EP0 GET_REPORT polling: SET_PROTOCOL(boot) during init, then 4 Hz boot-protocol GET_REPORT via EP0 with ring recycling
- EP0 ring recycling for keyboard (every ~84 polls) and mouse
- Bandwidth dance (SKIP_BW_DANCE=false): matches Linux's xhci_check_bandwidth() sequence with 3x ConfigureEndpoint + 2x StopRing per slot; confirmed matching Linux ftrace byte-for-byte
- BSR=0 confirmed correct (BSR=1 causes CC=19 Context State Error)
- Added EP0_MOUSE_POLL_STATE, MOUSE_CTRL_DATA_BUF, queue_ep0_mouse_get_report()
- Diagnostics: DIAG_DOORBELL_EP_STATE, DMA buffer address logging, endpoint state read after SET_CONFIG and at doorbell time
- Polling interval: % 10 (20 Hz attempt; bottleneck is Parallels ~4 Hz)

timer_interrupt.rs:

- Heartbeat counters: mr= mk= me= for mouse EP0 polling status
Co-Authored-By: Ryan Breen <ryan@ryanbreen.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
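The EP0 polling path described above comes down to two HID class requests on the default control pipe. The following is a minimal sketch of how their SETUP packets could be built, using the request codes from the USB HID class specification. The type and function names (`SetupPacket`, `hid_get_input_report`, `hid_set_boot_protocol`, `should_poll`) are illustrative assumptions and are not taken from xhci.rs.

```rust
/// 8-byte USB SETUP packet (fields are little-endian on the wire).
/// Hypothetical helper type, not the driver's own structure.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SetupPacket {
    pub bm_request_type: u8,
    pub b_request: u8,
    pub w_value: u16,
    pub w_index: u16,
    pub w_length: u16,
}

impl SetupPacket {
    /// Serialize into the byte order a Setup Stage TRB would carry.
    pub fn to_bytes(self) -> [u8; 8] {
        let v = self.w_value.to_le_bytes();
        let i = self.w_index.to_le_bytes();
        let l = self.w_length.to_le_bytes();
        [self.bm_request_type, self.b_request, v[0], v[1], i[0], i[1], l[0], l[1]]
    }
}

// HID class request codes (USB HID 1.11, section 7.2).
pub const HID_GET_REPORT: u8 = 0x01;
pub const HID_SET_PROTOCOL: u8 = 0x0B;
pub const REPORT_TYPE_INPUT: u16 = 0x01;

/// GET_REPORT(Input) for one interface: device-to-host, class, interface.
pub fn hid_get_input_report(interface: u16, report_id: u8, len: u16) -> SetupPacket {
    SetupPacket {
        bm_request_type: 0xA1,
        b_request: HID_GET_REPORT,
        w_value: (REPORT_TYPE_INPUT << 8) | report_id as u16,
        w_index: interface,
        w_length: len,
    }
}

/// SET_PROTOCOL with wValue = 0 selects the boot protocol.
pub fn hid_set_boot_protocol(interface: u16) -> SetupPacket {
    SetupPacket {
        bm_request_type: 0x21, // host-to-device, class, interface
        b_request: HID_SET_PROTOCOL,
        w_value: 0,
        w_index: interface,
        w_length: 0,
    }
}

/// Tick divider in the style of the `% 10` polling interval:
/// with a 200 Hz timer this requests a poll at 20 Hz.
pub fn should_poll(tick: u64, divider: u64) -> bool {
    tick % divider == 0
}
```

An 8-byte length fits a boot-protocol keyboard report; the interface numbers passed in are whatever the device's configuration descriptor reports, so the values used when exercising this sketch are placeholders.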
…12 diagnostics

- Add second mouse interface (DCI=5) support: MOUSE2_REPORT_BUF, MSI_MOUSE2_NEEDS_REQUEUE, NEEDS_RESET_MOUSE2, mouse_nkro_endpoint field in XhciState, full poll/reset/requeue paths
- Add 4th HID transfer ring (NUM_TRANSFER_RINGS = MAX_SLOTS + 4) for mouse2
- Remove EP0 GET_REPORT polling workaround (queue_ep0_get_report, queue_ep0_mouse_get_report, EP0_POLL_STATE, EP0_MOUSE_POLL_STATE, MOUSE_CTRL_DATA_BUF); switching to pure interrupt IN
- Expand KBD_REPORT_BUF and MOUSE_REPORT_BUF to 64 bytes for safety margin
- SKIP_BW_DANCE=true (diagnostic): Linux ftrace confirmed bandwidth dance IS performed; isolating whether batch ConfigEP alone enables endpoints on Parallels virtual xHC
- MINIMAL_INIT=true (diagnostic): confirmed CC=12 is in xHCI setup, not HID class requests
- Update bandwidth dance comments: Linux uses DIFFERENT Input Context address for each per-EP re-ConfigEP; drop+re-add flags per-EP; only target EP context populated (not all)
- Fix ring comment: four HID rings indexed kbd-boot/mouse/kbd-NKRO/mouse2

CC=12 (Endpoint Not Enabled) on interrupt IN endpoints remains unresolved. Next: compare endpoint context state before vs after SET_CONFIGURATION to determine if virtual xHC resets endpoint state on SET_CONFIG receipt.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
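The proposed before/after comparison relies on a few fixed encodings from the xHCI specification: the endpoint state lives in bits 2:0 of the first Endpoint Context dword, the completion code in bits 31:24 of a Transfer Event's status dword, and the Device Context Index of a non-control endpoint is `2 * endpoint number + direction`. A rough sketch of those decoders follows; the function names are hypothetical and the driver's own helpers may look different.

```rust
/// Endpoint states from the xHCI Endpoint Context (dword 0, bits 2:0).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EpState {
    Disabled,
    Running,
    Halted,
    Stopped,
    Error,
    Reserved(u8),
}

/// Decode the EP State field from the first dword of an Endpoint Context.
pub fn ep_state(ep_ctx_dword0: u32) -> EpState {
    match (ep_ctx_dword0 & 0x7) as u8 {
        0 => EpState::Disabled,
        1 => EpState::Running,
        2 => EpState::Halted,
        3 => EpState::Stopped,
        4 => EpState::Error,
        other => EpState::Reserved(other),
    }
}

/// Device Context Index for a non-control endpoint (EP0 always uses DCI 1).
pub fn dci(ep_number: u8, is_in: bool) -> u8 {
    ep_number * 2 + is_in as u8
}

// Completion codes referenced in this investigation.
pub const CC_SUCCESS: u8 = 1;
pub const CC_ENDPOINT_NOT_ENABLED: u8 = 12;
pub const CC_CONTEXT_STATE_ERROR: u8 = 19;

/// Completion Code is bits 31:24 of an event TRB's status dword.
pub fn completion_code(event_status_dword: u32) -> u8 {
    (event_status_dword >> 24) as u8
}

/// Report a state change between two snapshots of the same Endpoint Context,
/// e.g. one taken before SET_CONFIGURATION and one taken after.
pub fn state_transition(before_dword0: u32, after_dword0: u32) -> Option<(EpState, EpState)> {
    let (b, a) = (ep_state(before_dword0), ep_state(after_dword0));
    if b == a { None } else { Some((b, a)) }
}
```

With this mapping, DCI=3 and DCI=5 correspond to endpoints 1 IN and 2 IN, which matches the interrupt IN endpoints discussed here.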
CC=12 (Endpoint Not Enabled) from Parallels virtual xHC halts interrupt IN endpoints immediately on each poll when no key is pressed. Three changes bring the reset rate from 200/s down to ~10/s:

1. wait_for_command_completion CC=12 path: route to MSI_*_NEEDS_REQUEUE instead of NEEDS_RESET_*. This breaks the cascade where one endpoint's reset command wait sets NEEDS_RESET_* for the other endpoint's CC=12 event, causing both endpoints to reset on the same poll tick.
2. RESET_INTERVAL_TICKS (20 ticks = 100ms at 200Hz) + per-endpoint last-reset poll counters. NEEDS_RESET_* is deferred until the minimum interval elapses, capping resets at 10/s per endpoint.
3. CC=12 in handle_interrupt and poll_hid_events routes to NEEDS_RESET_* (not MSI_*_NEEDS_REQUEUE) so Halted endpoints get proper Reset Endpoint recovery rather than a doorbell that Halted endpoints silently ignore.

Verified on Parallels Desktop ARM64 (NEC uPD720200 virtual xHC):

- er drops from 200/s → 10/s (rate limited)
- xe tracks er 1:1 (one error per reset cycle, not compounding)
- c1=0x00020502 confirms CC=12 halts NKRO DCI=5 (state=2=Halted)
- ra=0x00020303 confirms Reset Endpoint leaves boot DCI=3 in Stopped (3)
- System stable for 100s with no panics or cascades

Co-Authored-By: Ryan Breen <ryan@ryanbreen.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
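The deferral in change 2 can be illustrated with a small per-endpoint limiter. This is only a sketch of the idea: `RESET_INTERVAL_TICKS` and its value come from the commit message, while the `ResetLimiter` struct, its methods, and `count_resets` are hypothetical stand-ins for the driver's NEEDS_RESET_* flags and last-reset counters.

```rust
/// Minimum spacing between Reset Endpoint commands: 20 ticks = 100 ms at 200 Hz.
pub const RESET_INTERVAL_TICKS: u64 = 20;

/// Hypothetical per-endpoint limiter; the driver's real state layout may differ.
pub struct ResetLimiter {
    needs_reset: bool,
    last_reset_tick: Option<u64>,
}

impl ResetLimiter {
    pub const fn new() -> Self {
        Self { needs_reset: false, last_reset_tick: None }
    }

    /// Called when a CC=12 event is seen for this endpoint.
    pub fn request(&mut self) {
        self.needs_reset = true;
    }

    /// Called once per poll tick. Returns true when a reset should be issued now;
    /// a pending request stays latched until the interval has elapsed.
    pub fn poll(&mut self, now: u64) -> bool {
        if !self.needs_reset {
            return false;
        }
        if let Some(last) = self.last_reset_tick {
            if now.wrapping_sub(last) < RESET_INTERVAL_TICKS {
                return false;
            }
        }
        self.needs_reset = false;
        self.last_reset_tick = Some(now);
        true
    }
}

/// Worst case from the commit message: the endpoint reports CC=12 on every tick.
pub fn count_resets(ticks: u64) -> u32 {
    let mut limiter = ResetLimiter::new();
    let mut resets = 0;
    for now in 0..ticks {
        limiter.request();
        if limiter.poll(now) {
            resets += 1;
        }
    }
    resets
}
```

Over one second of 200 Hz ticks with an error on every tick, this limiter issues a reset at ticks 0, 20, 40, and so on, which is consistent with the reported drop from 200/s to 10/s.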
…output

Remove 124 serial_println! calls from the xHCI driver. All diagnostics now go through the lock-free xhci_trace ring buffer (no locks, no allocations, safe in interrupt context). Only xhci_trace_dump() retains serial output for post-init trace extraction.

Key changes:

- xhci.rs: Remove all serial_println, replace key milestones with xhci_trace_note() calls. Fix Mult field to 0 per xHCI spec §6.2.3 (was incorrectly 1 for interrupt endpoints). Add USBLEGSUP handoff in Extended Capabilities parsing. Revert No-Op TRB diagnostic back to Normal TRB.
- timer_interrupt.rs: Trim heartbeat to essential fields only (time, ctx switches, syscalls, xhci errors, kbd events, first CC).
- pci.rs: Add enable_intx() and disable_msi() methods.
- drivers/mod.rs: Remove raw_serial_str debug breadcrumbs.
- build.rs: Add build ID based on timestamp for stale-build detection.
- main_aarch64.rs: Print BUILD_ID in boot banner.
- build-efi.sh: Prefer mformat over newfs_msdos for FAT32 on macOS.
- deploy-to-vm.sh: Rewrite with prl_disk_tool HDD creation, serial log truncation, NVRAM cleanup, and --boot flag.
- CLAUDE.md: Document Parallels VM workflow and restart protocol.
- Add Linux xHCI ftrace reference and trace analysis scripts.

CC=12 (ENDPOINT_NOT_ENABLED) on interrupt IN endpoints remains unsolved.

Co-Authored-By: Ryan Breen <ryanbreen@gmail.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
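A trace buffer with the properties listed above (no locks, no allocations, usable from interrupt context) can be approximated with a fixed array of atomics and an atomic write index. The sketch below shows one such design under assumed names; `TraceRing`, `record`, and `snapshot` are hypothetical and do not reflect the actual xhci_trace record format, which is not shown in this PR.

```rust
use core::sync::atomic::{AtomicU64, AtomicUsize, Ordering};

/// Number of slots; old entries are overwritten once the ring wraps.
const TRACE_SLOTS: usize = 8;

/// Fixed-size ring of 64-bit trace words. Writers only perform atomic
/// operations, so recording never blocks and never allocates.
pub struct TraceRing {
    next: AtomicUsize,
    slots: [AtomicU64; TRACE_SLOTS],
}

impl TraceRing {
    pub const fn new() -> Self {
        // A const item lets the non-Copy atomic be repeated in the array.
        const EMPTY: AtomicU64 = AtomicU64::new(0);
        Self { next: AtomicUsize::new(0), slots: [EMPTY; TRACE_SLOTS] }
    }

    /// Claim the next slot and store the event word.
    pub fn record(&self, event: u64) {
        let idx = self.next.fetch_add(1, Ordering::Relaxed) % TRACE_SLOTS;
        self.slots[idx].store(event, Ordering::Release);
    }

    /// Copy out the retained events, oldest first, with the number of valid entries.
    /// Intended for a quiescent dump; a concurrent writer may overwrite slots mid-copy.
    pub fn snapshot(&self) -> ([u64; TRACE_SLOTS], usize) {
        let total = self.next.load(Ordering::Acquire);
        let count = total.min(TRACE_SLOTS);
        let start = total - count;
        let mut out = [0u64; TRACE_SLOTS];
        for i in 0..count {
            out[i] = self.slots[(start + i) % TRACE_SLOTS].load(Ordering::Acquire);
        }
        (out, count)
    }
}
```

Deferring all formatting to a single dump routine keeps the hot path down to one `fetch_add` and one store, which is the property that makes this style of tracing safe to call from an interrupt handler.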
Detailed document covering the full state of the xHCI CC=12 (ENDPOINT_NOT_ENABLED) investigation across 26+ test iterations. Includes verified-correct items, remaining theories, code structure map, build/test instructions, and trace infrastructure usage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Changes
kernel/src/drivers/usb/xhci.rs

- Mouse EP0 GET_REPORT polling: SET_PROTOCOL(boot) during device init, then 4 Hz GET_REPORT via EP0 control pipe with ring recycling
- Bandwidth dance (SKIP_BW_DANCE=false): matches Linux's xhci_check_bandwidth() with 3× ConfigureEndpoint + 2× StopRing per slot
- Diagnostics: DIAG_DOORBELL_EP_STATE, DMA buffer address logging, endpoint state diagnostics
- Polling interval: % 10 (20 Hz attempt; bottleneck is Parallels' ~250ms USB device timing, effective rate stays 4 Hz)

kernel/src/arch_impl/aarch64/timer_interrupt.rs

- Heartbeat counters: mr= mk= me= counters for mouse EP0 polling status

Test plan

- gr=N gk=N ge=0 sustained at 4 Hz with zero errors
- mr=N mk=N me=0 sustained at 4 Hz with zero errors

🤖 Generated with Claude Code